

Measuring Systematic Generalization in Neural Proof Generation with Transformers

Neural Information Processing Systems

We are interested in understanding how well Transformer language models (TLMs) can perform reasoning tasks when trained on knowledge encoded in the form of natural language. We investigate their systematic generalization abilities on a logical reasoning task in natural language, which involves reasoning over relationships between entities grounded in first-order logical proofs. Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference. We observe length-generalization issues when the models are evaluated on sequences longer than those seen during training. However, we observe that TLMs improve their generalization performance after being exposed to longer, exhaustive proofs.
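
The abstract mentions testing generated proofs for logical consistency. As a rough, hedged sketch of what such a check could look like (the composition rules, triple format, and function name below are illustrative assumptions, not the paper's actual implementation):

```python
# Hedged sketch: one plausible way to check a generated proof for logical
# consistency. The composition table, fact format, and proof-step format
# are illustrative assumptions, not the dataset's actual schema.

# Relation-composition rules of the form:
# (relation of A to B, relation of B to C) -> relation of A to C
COMPOSE = {
    ("sister", "mother"): "aunt",      # hypothetical rule entries
    ("brother", "father"): "uncle",
}

def check_proof(facts, proof_steps, answer):
    """facts: iterable of (head, relation, tail) triples given in the story.
    proof_steps: list of (premise1, premise2, conclusion) triples of triples.
    answer: the (head, relation, tail) triple the query asks about."""
    known = set(facts)
    for premise1, premise2, conclusion in proof_steps:
        # Every premise must already be known (given or previously derived).
        if premise1 not in known or premise2 not in known:
            return False
        (h, r1, m1), (m2, r2, t), (h2, r_out, t2) = premise1, premise2, conclusion
        # The premises must chain through a shared middle entity, and the
        # conclusion must match the composed relation.
        if m1 != m2 or h != h2 or t != t2:
            return False
        if COMPOSE.get((r1, r2)) != r_out:
            return False
        known.add(conclusion)
    # The final answer must have been derived by the proof.
    return answer in known
```

Under these assumptions, a generated proof counts as consistent only if every step chains already-known facts through a valid composition rule and the queried answer is among the derived facts.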


Review for NeurIPS paper: Measuring Systematic Generalization in Neural Proof Generation with Transformers

Neural Information Processing Systems

This paper evaluates a trained-from-scratch Transformer language model on an artificial simple-theorem-proving task in a way that helps to highlight and clarify some limitations of this commonly-used architecture. Reviewers found some points in the motivation and in the discussion of results potentially misleading, especially regarding the connection between this work and natural language, but ultimately formed a consensus that the primary claims of the paper are sound and significant, and that the remaining presentational issues don't undermine them.


Review for NeurIPS paper: Measuring Systematic Generalization in Neural Proof Generation with Transformers

Neural Information Processing Systems

Summary and Contributions: This paper evaluates how well Transformer language models can generate natural language expressions corresponding to first-order logical proofs, and their answers. Given a dataset of facts (tuples like entity1-relation1-entity2, entity2-relation2-entity3) and a query (entity1-?-entity3), the language model is trained on a sentence representing the facts, the query, a proof, and the answer. The proof is a chain of implications (for example, one step is "since entity1 is in relation1 with entity2 and entity2 is in relation2 with entity3, then entity1 is in relation2 with entity3"). The answer is the missing relation, such as relation2. The model can then be tested by presenting only the prefix of the expressions corresponding to the facts and the query (and perhaps the proof), and predicting the answer. The paper evaluates the ability of Transformer language models to generalize in several settings, determined by the number of relations.
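
To make the serialization described above concrete, here is a minimal, hedged sketch of how one such example might be assembled; the textual templates, entity and relation names, and the helper function are hypothetical and only illustrate the facts-query-proof-answer layout, not the paper's actual preprocessing.

```python
# Hedged sketch of how one training example might be serialized, following
# the format described in the review (facts, query, proof, answer).

facts = [("Alice", "sister", "Bob"), ("Bob", "father", "Carol")]
query = ("Alice", "?", "Carol")
proof = [
    "since Alice is the sister of Bob and Bob is the father of Carol, "
    "then Alice is the aunt of Carol"
]
answer = "aunt"

def fact_to_text(head, relation, tail):
    return f"{head} is the {relation} of {tail}"

# Full sequence the language model is trained on (next-token prediction).
train_sequence = (
    "facts: " + ". ".join(fact_to_text(*f) for f in facts) + ". "
    + f"query: what is {query[0]} to {query[2]}? "
    + "proof: " + " ".join(proof) + " "
    + f"answer: {answer}"
)

# At test time, only the prefix covering the facts and the query (and
# perhaps the proof) is given, and the model generates the rest, ending
# in the answer.
eval_prefix = train_sequence[: train_sequence.index("proof:")]
print(train_sequence)
print(eval_prefix)
```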


ORCHARD: A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

Pung, Bill Tuck Weng, Chan, Alvin

arXiv.org Artificial Intelligence

The ability to reason with multiple hierarchical structures is an attractive and desirable property of sequential inductive biases for natural language processing. Do state-of-the-art Transformer and LSTM architectures implicitly encode these biases? To answer this, we propose ORCHARD, a diagnostic dataset for systematically evaluating hierarchical reasoning in state-of-the-art neural sequence models. While there have been prior evaluation frameworks such as ListOps or Logical Inference, our work presents a novel and more natural setting in which models learn to reason with multiple explicit hierarchical structures instead of only one, i.e., the task requires long-term sequence memorization and relational reasoning while reasoning over hierarchical structure. Consequently, backed by a set of rigorous experiments, we show that (1) Transformer and LSTM models surprisingly fail to generalize systematically, and (2) as references between hierarchies increase, the Transformer performs no better than random.